Abstract

Can the spatial distance between two identical particles be explained 
in terms of the extent that one can be distinguished from the other? Is the 
geometry of space a macroscopic manifestation of an underlying micro- 
scopic statistical structure? Is geometrodynamics derivable from general 
principles of inductive inference? Tentative answers are suggested by a 
model of geometrodynamics based on the statistical concepts of entropy, 
information geometry, and entropic dynamics. 

1 Introduction 

The purpose of dynamical theories is to predict or explain the changes observed 
in physical systems on the basis of information that is codified into what one 
calls the states of the system. One common view is that these dynamical theories 
- the laws of physics - are successful because they happen to reflect the true 
laws of nature. 

Here I wish to follow an alternative path: perhaps once the relevant infor- 
mation has been identified the question of predicting changes is just a matter 
of careful consistent manipulation of the available information. If this turns 
out to be the case, then the laws of physics should follow directly from rules 
for processing information, that is, the rules of probability theory £Q and the 
method of maximum entropy (ME) [2J-01- 1 

There are some indications that this point of view is worth pursuing. Indeed, 
thermodynamics is a prime example of a fundamental physical theory that can 
be derived from general principles of inference [2]. Quantum mechanics provides
a second, less trivial, and less well known example UJ. Both theories follow 
from a correct specification of the subject matter, that is, an appropriate choice 

1 On terminology: The ME method is designed for processing information to update from 
a prior probability distribution to a posterior distribution. (The terms 'prior' and 'posterior' 
are used with similar meanings in the context of Bayes' theorem.) The ME method is usually 
understood in the restricted sense that one updates from a prior distribution that happens to 
be uniform - this is the usual postulate of equal a priori probabilities. Here we adopt a broader 
meaning that includes updates from arbitrary priors and which involves the maximization of 
relative entropy. Since all entropies are relative to some prior, be it uniform or not, the 
qualifier 'relative' is redundant and will henceforth be omitted. For a brief account of the ME 
method in a form that is convenient for our current purposes see 



of variables - this is the truly difficult step - plus probabilistic and entropic 
arguments. 

A third independent clue is found when one attempts to derive classical 
dynamical theories from purely entropic arguments. The surprising outcome 
is that the resulting "entropic" dynamics (ED) shows remarkable similarities 
with the general theory of relativity - geometrodynamics (GD). The general 
purpose of this paper is to take the first tentative steps towards explaining 
geometrodynamics as a form of entropic dynamics. 

The procedure to derive an ED involves three steps 0. The first step is to 
identify the subject matter and the corresponding space of observable states or, 
perhaps more appropriately, the space of macrostates. This is not easy because 
there exists no systematic way to search for the right macrovariables; it is a 
matter of taste and intuition, trial and error. 

The second step is to define a quantitative measure of the change or the 
"distance" from one state to another. Although in general the choice of distance 
is not unique an exception occurs when the macrostates can be interpreted 
as probability distributions over some appropriate space of microstates. Then 
there is a natural distance which is given by the Fisher-Rao information metric 
(its uniqueness is discussed in ^HIM; for a brief heuristic derivation
see m]). It measures the extent to which one probability distribution can be 
distinguished from another. This second step - assigning a statistical distance 
- is not straightforward either: more inspired guesswork is needed unless the 
right microstates happen to be known beforehand. 2 

The third and final step is easier. We ask: Given the initial and the final 
states, what trajectory is the system expected to follow? The question implicitly 
assumes that there is a trajectory, that in moving from one state to another 
the system will pass through a continuous set of intermediate states, and that 
information about the initial and final states is sufficient to determine them. 
The answer follows from a principle of inference, the ME principle, and not 
from any additional "physical" postulates. 

The resulting ED is elegant and not trivial: the system moves along a 
geodesic but the geometry of the space of states is curved and possibly quite 
complicated. Since the only available clock is the system itself there is no refer- 
ence to an external physical time. The natural intrinsic time is defined by the 
change of the system itself - in ED time is change - and can only be obtained 
after the equations of motion are solved. ED is a timeless Machian dynamics 
and its features resemble those advocated by Barbour: it is reversible; it can
be derived from a Jacobi action principle rather than the more familiar action 
principle of Hamilton; and its canonical Hamiltonian formulation is an example 
of a dynamics driven by constraints. 

The similarities to GD are striking. For example, in GD there is no reference 
to an external physical time. The proper time interval along any curve between 
an initial and a final three-dimensional geometry of space is determined only

2 The recognition that spaces of probability distributions are metric spaces has nevertheless
been fruitful in statistics, where the subject is known as Information Geometry 1 1 .'ill 141 . and 
in physics 1151 — 118^ . 
after solving the Einstein equations of motion [20]. The absence of an external
time has been a serious impediment in understanding GD because it is not 
clear which variables represent the true gravitational degrees of freedom 21 - 
|2~1] . GD is also derived from a Jacobi action principle [23] |2H1 and its canonical 
Hamiltonian formulation is an example of a dynamics driven by constraints
[27]-[29]. The question, therefore, is whether GD is an example of ED. The answer
requires identifying those variables that describe the true degrees of freedom of 
the gravitational field. 

The tentative steps of making assumptions about the subject matter, the 
macrostates, and about how to associate a probability distribution to each of 
them are taken in section 2. We want to predict the evolution of the three- 
dimensional geometry of space. The problem is that space is invisible. What 
we see is not space, but matter in space and we do not quite know how to 
disentangle which properties should be attributed to the matter and which to 
space. The best one can do is to choose the simplest form of matter: a substance 
that is neutral to all interactions and is itself describable by a minimal number 
of attributes. This ideal form of matter is a dust of identical particles; being 
neutral they will only interact gravitationally, and being identical the issue of 
what it is that distinguishes them - size, mass, flavor - does not arise. Thus we 
assume there is nothing to space beyond what can be learned from observing 
the evolving distribution of dust particles. The geometry of space is just the ge- 
ometry of all the distances between dust particles. Furthermore, we assume this 
geometry is of statistical origin. Identical particles that are close together are 
easy to confuse, those that are far apart are easy to distinguish. The distance 
between two neighboring particles is the distinguishability distance given by a 
Fisher-Rao metric. Notice that the Fisher-Rao metric is used in two conceptu- 
ally different ways. One is to distinguish successive states of the same system, 
the other is to distinguish different neighboring particles. The first is related to 
time, the second to space. 

Having decided what system is under study and how it is statistically de- 
scribed we can proceed to define its ED. In section 3, as a warm up problem, 
we develop the ED of a single point, and then, in section 4, we generalize to the 
whole dust cloud. Although the resulting statistical GD is not Einstein's GD of 
space-time - an indication that the states and variables we have chosen do not 
accurately describe the gravitational degrees of freedom - it is close enough to 
be encouraging. The model GD developed here corresponds to what is called 
an ultralocal or strong gravity theory (H0j-|S3- We do not recover the notion 
of space-time but we do find an embryonic form of Lorentz invariance in that 
simultaneity is relative. Finally, in section 5 we summarize our conclusions. 

2 The Geometry of a Dust Cloud 

Consider a cloud of identical specks of dust suspended in an otherwise empty 
space. And there is nothing else; in particular, there are no rulers and no clocks, 
just dust. Our goal is to study how the cloud evolves. We do this by keeping 
track of individual specks of dust.

Being identical the particles are easy to confuse. The only distinction be- 
tween two of them is that one happens to be here while the other is over there. 
To distinguish one speck of dust from another we assign labels or coordinates 
to each particle. We assume that three real numbers $(y^1, y^2, y^3)$ are sufficient.

Of course, particles can be mislabeled. Then the "true" coordinates $y$ are
unknown and one can only provide an estimate, $x$. Let $p(y|x)\,dy$ be the probability
that the particle labeled $x$ should have been labeled $y$. The labels $x$ are
introduced to distinguish one particle from another, but can we distinguish a
particle at $x$ from another at $x + dx$? If $dx$ is small enough the corresponding
probability distributions $p(y|x)$ and $p(y|x + dx)$ overlap considerably and it is
easy to confuse them. We seek a quantitative measure of the extent to which
these two distributions can be distinguished.

The following crude argument is intuitively appealing. Consider the relative 
difference, 

$$\frac{p(y|x + dx) - p(y|x)}{p(y|x)} = \frac{\partial \log p(y|x)}{\partial x^i}\, dx^i . \tag{1}$$

The expected value of this relative difference does not provide us with the desired 
measure of distinguishability: it vanishes identically. However, the variance does 
not vanish, 

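$$d\lambda^2 = \int dy\, p(y|x)\, \frac{\partial \log p(y|x)}{\partial x^i} \frac{\partial \log p(y|x)}{\partial x^j}\, dx^i dx^j \overset{\text{def}}{=} \gamma_{ij}(x)\, dx^i dx^j . \tag{2}$$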

This is the measure of distinguishability we seek. Except for an overall multiplicative
constant, the Fisher-Rao metric $\gamma_{ij}$ is the only Riemannian metric that
adequately reflects the underlying statistical nature of the abstract manifold of
the distributions $p(y|x)$ [TU ) [TT ] .
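The vanishing of the expected relative difference follows in one line from normalization. As a sketch, assuming (as seems implicit in the argument) that differentiation with respect to $x^i$ can be interchanged with the integral over $y$:

```latex
\int dy\, p(y|x)\, \frac{\partial \log p(y|x)}{\partial x^i}
  = \int dy\, \frac{\partial p(y|x)}{\partial x^i}
  = \frac{\partial}{\partial x^i} \int dy\, p(y|x)
  = \frac{\partial}{\partial x^i}\, 1
  = 0 .
```

The first moment of the relative difference therefore carries no information, and the second moment is the lowest-order quantity that can measure distinguishability.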

We take the further step of interpreting $d\lambda$ as the spatial distance of the
three-dimensional space the dust inhabits. Indeed, one would normally say that 
the reason it is easy to confuse two particles is that they happen to be too close 
together. We argue in the opposite direction and explain that the reason the 
particles at x and at x + dx are close together is because they are difficult to 
distinguish. 

The origin of the uncertainty will be left unspecified; perhaps it is due to 
a limit on the ultimate resolution of observation devices, or perhaps, as with 
a particle undergoing Brownian motion, the uncertainty might be caused by a 
fluctuating physical agent. It is required, however, that two particles at the 
same location in space must be affected by the same uncertainty, the same 
irreducible noise. Then the noise is not linked to the particle, but to the place, 
and we might as well say that the source of the irreducible noise is space itself. 
This is somewhat analogous to the principle of equivalence: it is the fact that 
all particles irrespective of their mass move along the same trajectories in a 
gravitational field that allows us to eliminate the notion of a gravitational field 
and attribute their common behavior to a single universal agent, the curvature 
of space-time. 
To assign an explicit $p(y|x)$ and explore the geometry it induces we will
consider what is perhaps the simplest possibility. We assume that the uncertainty
in the coordinate $x$ is small so that $p(y|x)$ is sharply localized in a
neighborhood about $x$ and within this very small region curvature effects can
be neglected. Further, we assume that particles are labeled by the expected
values $\langle y^i \rangle = x^i$ and that the information that happens to be necessary for
the purpose of prediction of future behavior is given by the second moments
$\langle (y^i - x^i)(y^j - x^j) \rangle = C^{ij}(x)$. This is physically reasonable: for each particle
we have estimates of its position and of the small margin of error. Then
$p(y|x)$ can be determined by maximizing entropy relative to an appropriate prior.
To the extent that curvature effects are negligible, the underlying space is flat
and translationally invariant. Thus, symmetry suggests a uniform prior and the
resulting ME distribution is Gaussian,



$$p(y|x) = \frac{C^{1/2}}{(2\pi)^{3/2}} \exp\left[-\frac{1}{2} C_{ij}\, (y^i - x^i)(y^j - x^j)\right] , \tag{3}$$

where $C_{ij}$ is the inverse of the covariance coefficients $C^{ij}$, $C^{ik} C_{kj} = \delta^i_j$, and
$C = \det C_{ij}$. The corresponding metric is obtained by substituting into eq. (2).
For small uncertainties $C_{ij}(x)$ is constant within the region where $p(y|x)$ is
appreciable and the result is

$$\gamma_{ij}(x) = C_{ij}(x) . \tag{4}$$

The metric changes smoothly over space and, in general, space is curved. The 
connection, the curvature, and other aspects of its Riemannian geometry can 
be computed in the standard way. The probability distributions, 



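$$p(y|x) = \frac{\gamma^{1/2}(x)}{(2\pi)^{3/2}} \exp\left[-\frac{1}{2} \gamma_{ij}(x)\, (y^i - x^i)(y^j - x^j)\right] , \tag{5}$$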



also vary smoothly with x. 
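The statement that the information metric of a sharply localized Gaussian equals the inverse covariance can be checked numerically. The following is a minimal sketch, not part of the original derivation: the covariance matrix, the sample size and the function name are illustrative assumptions, and the estimate is a Monte Carlo average rather than an exact integral.

```python
import numpy as np

def fisher_metric_of_labels(cov, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the mean score and of gamma_ij for a Gaussian p(y|x)."""
    rng = np.random.default_rng(seed)
    precision = np.linalg.inv(cov)        # C_ij, the inverse of the covariance C^ij
    # Displacements y - x; for constant covariance the result does not depend on x.
    dy = rng.multivariate_normal(np.zeros(cov.shape[0]), cov, size=n_samples)
    score = dy @ precision                # d log p / d x^i = C_ij (y^j - x^j)
    mean_score = score.mean(axis=0)       # first moment: should be close to zero
    gamma = score.T @ score / n_samples   # second moment: the information metric
    return mean_score, gamma

# Hypothetical covariance, chosen only for illustration.
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
mean_score, gamma = fisher_metric_of_labels(cov)
print(np.round(gamma, 2))                 # Monte Carlo estimate of gamma_ij
print(np.round(np.linalg.inv(cov), 2))    # inverse covariance C_ij for comparison
```

Up to sampling noise the two printed matrices should agree, and the mean score should be small compared with the metric entries.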

To summarize, we have succeeded in describing the information geometry 
that derives from considerations of distinguishability among particles. The idea 
is rather general but was developed explicitly only for the special case of small 
uncertainties, that is, for particles that can be localized within regions much 
smaller than those where curvature effects become appreciable. An interest- 
ing question that will not be addressed here concerns the extension to those 
situations of extreme curvature found near singularities. 

Before discussing dynamics we mention that there is one very peculiar feature
of the distance $d\lambda$, eq. (2), that may be very significant: $d\lambda^2$ is dimensionless.
The metric $\gamma_{ij}(x)$ allows one to measure spatial lengths in terms of a local
standard, the local uncertainty width. This immediately raises the question of
how to compare the uncertainty widths, and therefore lengths, at two distant
locations. One possibility, which we pursue in the rest of this paper, is that $\gamma_{ij}$
describes the Riemannian geometry of space. This amounts to asserting that the
uncertainty widths are the same everywhere; they provide us with a universal
standard of length. A second, more intriguing possibility, which we will explore
elsewhere, is that all the information metric $\gamma_{ij}$ allows us to do is to compare
the lengths of small segments in different orientations at the same location; it
allows one to measure angles. Then $\gamma_{ij}$ does not describe the geometry of space
completely; it only describes its conformal geometry.



3 Entropic Dynamics of a Single Point 

In this section we develop the ED of a single Gaussian distribution, an analogue
of GD in zero spatial dimensions. Let $\Gamma$ be the space of states. The points in $\Gamma$
are Gaussian distributions with zero mean, $\langle y \rangle = 0$,

$$p(y|\gamma) = \frac{\gamma^{1/2}}{(2\pi)^{3/2}} \exp\left(-\frac{1}{2} \gamma_{ij}\, y^i y^j\right) , \tag{6}$$

where $\gamma = \det \gamma_{ij}$ and $y = (y^1, y^2, y^3)$ are points in $R^3$. Whether $\gamma$ denotes the
matrix $\gamma_{ij}$ or its determinant should, in what follows, be clear from the context.
Since $\gamma_{ij} = \gamma_{ji}$ is symmetric, $\Gamma$ is a six-dimensional space.

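The following notation is convenient: the derivative $\partial/\partial\gamma_{ij}$ of a function
$F(\gamma)$ is defined so that $dF$ takes the simple form

$$dF \overset{\text{def}}{=} \frac{\partial F}{\partial \gamma_{ij}}\, d\gamma_{ij} . \tag{7}$$

$\partial F/\partial\gamma_{ij}$ coincides with the usual partial derivative times $(1 + \delta_{ij})/2$. To operate
with $\partial/\partial\gamma_{ij}$ we only need to find out how it acts on $\gamma_{kl}$ and on its inverse $\gamma^{kl}$.
We find

$$\frac{\partial \gamma_{kl}}{\partial \gamma_{ij}} = \frac{1}{2}\left(\delta^i_k \delta^j_l + \delta^j_k \delta^i_l\right) \overset{\text{def}}{=} \delta^{ij}_{kl} \quad\text{and}\quad \frac{\partial \gamma^{kl}}{\partial \gamma_{ij}} = -\frac{1}{2}\left(\gamma^{ki}\gamma^{lj} + \gamma^{kj}\gamma^{li}\right) . \tag{8}$$

Note that $\delta^{kl}_{ij} \gamma_{kl} = \gamma_{ij}$ and $\delta^{ij}_{kl} \gamma^{kl} = \gamma^{ij}$. We will also need to differentiate the
determinant $\gamma = \det \gamma_{ij}$,

$$d\gamma = \gamma\, \gamma^{ij}\, d\gamma_{ij} , \quad\text{or}\quad \frac{\partial \gamma}{\partial \gamma_{ij}} = \gamma\, \gamma^{ij} . \tag{9}$$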
a lij 

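The Fisher-Rao metric $g^{ijkl}$ on the space $\Gamma$ is

$$g^{ijkl} = \int dy\, p(y|\gamma)\, \frac{\partial \log p(y|\gamma)}{\partial \gamma_{ij}} \frac{\partial \log p(y|\gamma)}{\partial \gamma_{kl}} = \frac{1}{4}\left(\gamma^{ik}\gamma^{jl} + \gamma^{il}\gamma^{jk}\right) , \tag{10}$$

and its inverse metric, defined by $g^{ijkl} g_{klmn} = \delta^{ij}_{mn}$, is

$$g_{klmn} = \gamma_{km}\gamma_{ln} + \gamma_{kn}\gamma_{lm} . \tag{11}$$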
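The index bookkeeping for the metric on the space of states and its inverse is easy to get wrong, so a numerical check may help. The sketch below assumes the closed forms $g^{ijkl} = \frac{1}{4}(\gamma^{ik}\gamma^{jl} + \gamma^{il}\gamma^{jk})$ and $g_{klmn} = \gamma_{km}\gamma_{ln} + \gamma_{kn}\gamma_{lm}$, and verifies that their contraction gives the symmetrized identity $\frac{1}{2}(\delta^i_m \delta^j_n + \delta^i_n \delta^j_m)$. The function names and the test matrix are illustrative, not taken from the text.

```python
import numpy as np

def metric_upper(gamma):
    """g^{ijkl} = (1/4)(gamma^{ik} gamma^{jl} + gamma^{il} gamma^{jk})."""
    inv = np.linalg.inv(gamma)
    return 0.25 * (np.einsum("ik,jl->ijkl", inv, inv)
                   + np.einsum("il,jk->ijkl", inv, inv))

def metric_lower(gamma):
    """g_{klmn} = gamma_{km} gamma_{ln} + gamma_{kn} gamma_{lm}."""
    return (np.einsum("km,ln->klmn", gamma, gamma)
            + np.einsum("kn,lm->klmn", gamma, gamma))

def symmetric_identity(n):
    """delta^{ij}_{mn} = (1/2)(delta^i_m delta^j_n + delta^i_n delta^j_m)."""
    eye = np.eye(n)
    return 0.5 * (np.einsum("im,jn->ijmn", eye, eye)
                  + np.einsum("in,jm->ijmn", eye, eye))

# Arbitrary symmetric positive-definite matrix standing in for gamma_ij.
gamma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.5, 0.2],
                  [0.1, 0.2, 1.0]])
product = np.einsum("ijkl,klmn->ijmn", metric_upper(gamma), metric_lower(gamma))
is_inverse = np.allclose(product, symmetric_identity(3))
print(is_inverse)
```

The same helpers give the squared length of a small symmetric displacement `d` as `np.einsum("ijkl,ij,kl->", metric_upper(gamma), d, d)`.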

Now we can tackle the dynamics. The key to the question "Given initial 
and final states, what trajectory is the system expected to follow?" lies in 
the implicit assumption that there exists a continuous trajectory. This means
that large changes are the result of a continuous succession of very many small 
changes; the problem of studying large changes is reduced to the simpler problem 
of studying small changes. 

We want to determine the states along a short segment of the trajectory as
the system moves from an initial state $\gamma$ to a neighboring final state $\gamma + \Delta\gamma$. To
find the intermediate states we reason that in going from the initial to the final
state the system must pass through a halfway point, that is, an intermediate
state that is equidistant from $\gamma$ and $\gamma + \Delta\gamma$. Finding the halfway point clearly
determines the trajectory: first find the halfway point, and use it to determine
'quarter of the way' points, and so on. But there is nothing special about halfway
states. In general, we can assert that the system must pass through intermediate
states $\gamma_\omega$ such that, having already moved a distance $d\ell$ away from the initial
$\gamma$, there remains a distance $\omega\, d\ell$ to be covered to reach the final $\gamma + \Delta\gamma$; $\omega$ is
any positive number.

The basic dynamical question can be rephrased as follows: The system is
initially described by the probability distribution $p(y|\gamma)$ and we are given the
new information that the system has moved to one of the neighboring states in
the family $p(y|\gamma_\omega)$. Which $p(y|\gamma_\omega)$ do we select? Phrased in this way it is clear
that this is precisely the kind of problem to be tackled using the ME method (see footnote 1).
The selected distribution is that which maximizes the relative entropy of $p(y|\gamma_\omega)$
relative to a prior distribution $p_{\text{old}}$. Since in the absence of new information there
is no reason to change one's mind, when there are no constraints the selected
posterior distribution should coincide with the prior distribution. Therefore the
prior $p_{\text{old}}$ is the initial state $p(y|\gamma)$. Thus, to determine the intermediate state
$\gamma_\omega = \gamma + d\gamma$ one varies over $d\gamma_{ij}$ to maximize

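$$S[p(y|\gamma + d\gamma), p(y|\gamma)] = -\int dy\, p(y|\gamma + d\gamma) \log \frac{p(y|\gamma + d\gamma)}{p(y|\gamma)} = -\frac{1}{2}\, g^{ijkl}\, d\gamma_{ij}\, d\gamma_{kl} = -\frac{1}{2}\, d\ell^2 , \tag{12}$$

subject to the constraint $d\ell_f = \omega\, d\ell$ where

$$d\ell_f^2 = g^{ijkl} \left(\Delta\gamma_{ij} - d\gamma_{ij}\right) \left(\Delta\gamma_{kl} - d\gamma_{kl}\right) . \tag{13}$$

Introducing a Lagrange multiplier $\lambda/2$,

$$0 = \delta\left[ -\frac{1}{2}\, g^{ijkl}\, d\gamma_{ij}\, d\gamma_{kl} + \frac{\lambda}{2} \left(\omega^2 d\ell^2 - d\ell_f^2\right) \right] , \tag{14}$$

then, the selected $d\gamma_{ij}$ is given by

$$d\gamma_{ij} = \chi\, \Delta\gamma_{ij} \quad\text{where}\quad \chi = \frac{\lambda}{1 + \lambda - \lambda\omega^2} . \tag{15}$$

Substituting $d\gamma_{ij}$ into $d\ell$ and $d\ell_f$ we get $d\ell = \chi\, \Delta\ell$ and $d\ell_f = (1 - \chi)\, \Delta\ell$, so
that $\chi = (1 + \omega)^{-1}$ with $0 < \chi < 1$ and

$$d\ell + d\ell_f = \Delta\ell . \tag{16}$$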
The interpretation is clear: the three states $\gamma$, $\gamma_\omega$ and $\gamma + \Delta\gamma$ lie on a straight
line. The expected trajectory is the geodesic that passes through the given
initial and final states.
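The step from the constrained variation to the selected displacement can be spelled out. The following worked derivation is a reconstruction under the assumptions that the metric $g^{ijkl}$ is evaluated at the initial state $\gamma$ and held fixed during the variation, and that $\Delta\ell^2 = g^{ijkl} \Delta\gamma_{ij} \Delta\gamma_{kl}$; the closed form for the multiplier $\lambda$ in the last line is derived here and is not quoted from the text.

```latex
% Vary  -\tfrac12 d\ell^2 + \tfrac{\lambda}{2}(\omega^2 d\ell^2 - d\ell_f^2)  with respect to d\gamma_{ij}:
-\, g^{ijkl}\, d\gamma_{kl}
  + \lambda \left[ \omega^2\, g^{ijkl}\, d\gamma_{kl}
  + g^{ijkl} \left( \Delta\gamma_{kl} - d\gamma_{kl} \right) \right] = 0
\;\Longrightarrow\;
\left( 1 + \lambda - \lambda\omega^2 \right) d\gamma_{kl} = \lambda\, \Delta\gamma_{kl} .

% Hence d\gamma = \chi \Delta\gamma, and the two lengths are
d\ell = \chi\, \Delta\ell , \qquad d\ell_f = (1 - \chi)\, \Delta\ell .

% Imposing the constraint d\ell_f = \omega\, d\ell fixes \chi and the multiplier:
1 - \chi = \omega \chi
\;\Longrightarrow\;
\chi = \frac{1}{1 + \omega} , \qquad
\lambda = \frac{1}{\omega (1 + \omega)} .
```

Since $\chi$ multiplies the fixed displacement $\Delta\gamma_{ij}$, changing $\omega$ only moves the intermediate state along the same segment.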

Note that each different value of $\omega$ provides a different criterion to select
the trajectory and an inconsistency would arise if these criteria led to different
trajectories. It is reassuring to find that indeed the ED trajectory is independent
of the value of $\omega$.
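Both the quadratic approximation to the entropy and the independence of the trajectory from $\omega$ can be illustrated numerically. The sketch below makes several assumptions: the states are zero-mean Gaussians whose inverse covariance matrices are `gamma` and `gamma + d`, lengths are computed with the metric evaluated at the initial state, and the matrices and function names are made up for the example.

```python
import numpy as np

def length(gamma, d):
    """sqrt(g^{ijkl} d_ij d_kl) = sqrt(tr[(gamma^-1 d)^2] / 2), metric taken at gamma."""
    a = np.linalg.solve(gamma, d)
    return np.sqrt(0.5 * np.trace(a @ a))

def selected_step(delta, omega):
    """Selected displacement chi * delta with chi = 1 / (1 + omega)."""
    return delta / (1.0 + omega)

def entropy(gamma, d):
    """Relative entropy S[p(y|gamma + d), p(y|gamma)] for zero-mean Gaussians."""
    n = gamma.shape[0]
    ratio = np.linalg.solve(gamma + d, gamma)      # (gamma + d)^-1 gamma
    logdet_new = np.linalg.slogdet(gamma + d)[1]
    logdet_old = np.linalg.slogdet(gamma)[1]
    return -0.5 * (np.trace(ratio) - n + logdet_new - logdet_old)

gamma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.5, 0.2],
                  [0.1, 0.2, 1.0]])
delta = 1e-3 * np.array([[1.0, 0.5, 0.0],
                         [0.5, -2.0, 0.3],
                         [0.0, 0.3, 0.7]])
total = length(gamma, delta)

checks = []
for omega in (0.25, 1.0, 4.0):
    step = selected_step(delta, omega)
    dl, dl_f = length(gamma, step), length(gamma, delta - step)
    # Each entry: omega, the ratio dl_f / dl, and (dl + dl_f) / total.
    checks.append((omega, dl_f / dl, (dl + dl_f) / total))

# Ratio of the exact relative entropy to its quadratic approximation -dl^2 / 2.
entropy_ratio = entropy(gamma, delta) / (-0.5 * total**2)
print(checks, entropy_ratio)
```

In this sketch every choice of `omega` reproduces the constraint and places the intermediate state on the same straight segment, and `entropy_ratio` approaches one as `delta` shrinks.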

ED determines the vector tangent to the trajectory $d\gamma/d\ell$, but not the actual
velocity $d\gamma/dt$. In conventional forms of dynamics the distance $\ell$ along the
trajectory is related to an external time $t$ through a Hamiltonian which fixes
the evolution relative to external clocks. But here the only clock available is
the system itself, which can only provide an internal, intrinsic time. It is best to
define the intrinsic time so that motion looks simple. A natural definition consists
in stipulating that the system moves with unit velocity; then the intrinsic
time is given by the distance $\ell$ itself. The intrinsic time interval is the amount
of change. A peculiar feature of this notion of time is that intervals are not
known a priori; they are determined only after the equations of motion are solved
and the actual trajectory is determined.

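The geodesics in the space $\Gamma$ are obtained by minimizing the Jacobi action

$$J[\gamma] = \int_{\eta_i}^{\eta_f} d\eta\, L(\gamma, \dot\gamma) , \tag{17}$$

where $\eta$ is an arbitrary parameter along the trajectory and $\dot\gamma_{ij} = d\gamma_{ij}/d\eta$. The
Lagrangian is just the arc length

$$L(\gamma, \dot\gamma) = \left(g^{ijkl}\, \dot\gamma_{ij}\, \dot\gamma_{kl}\right)^{1/2} = \left(\frac{1}{2}\, \gamma^{ik}\gamma^{jl}\, \dot\gamma_{ij}\, \dot\gamma_{kl}\right)^{1/2} . \tag{18}$$

The canonical momenta are

$$\pi^{ij} = \frac{\partial L}{\partial \dot\gamma_{ij}} = \frac{1}{L}\, g^{ijkl}\, \dot\gamma_{kl} = \frac{1}{2L}\, \gamma^{ik}\gamma^{jl}\, \dot\gamma_{kl} , \tag{19}$$

and have a fixed magnitude

$$g_{ijkl}\, \pi^{ij} \pi^{kl} = 1 . \tag{20}$$

The canonical Hamiltonian vanishes identically

$$H_{\text{can}}(\gamma, \pi) = \dot\gamma_{ij}\, \pi^{ij} - L(\gamma, \dot\gamma) = 0 , \tag{21}$$

because the Lagrangian is homogeneous of first degree in the velocities. The
manifest reparametrization invariance of the action $J[\gamma]$ conveniently reflects
the absence of an external time with respect to which the system could possibly
evolve.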
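The link between first-degree homogeneity, the vanishing Hamiltonian and reparametrization invariance can be made explicit. This is a short worked sketch using Euler's theorem for homogeneous functions; the scale factor $a$ and the new parameter $\eta'$ are notation introduced here for illustration.

```latex
% Homogeneity of first degree in the velocities:
L(\gamma, a\,\dot\gamma) = a\, L(\gamma, \dot\gamma) \quad (a > 0) .

% Differentiating with respect to a at a = 1 (Euler's theorem):
\dot\gamma_{ij}\, \frac{\partial L}{\partial \dot\gamma_{ij}} = L
\;\Longrightarrow\;
H_{\text{can}} = \dot\gamma_{ij}\, \pi^{ij} - L = 0 .

% Under a monotonic change of parameter \eta \to \eta'(\eta):
d\eta\; L\!\left(\gamma, \frac{d\gamma}{d\eta}\right)
  = d\eta\; \frac{d\eta'}{d\eta}\, L\!\left(\gamma, \frac{d\gamma}{d\eta'}\right)
  = d\eta'\, L\!\left(\gamma, \frac{d\gamma}{d\eta'}\right) .
```

The same property that makes the action independent of the choice of parameter is what forces the canonical Hamiltonian to vanish.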

Since variations of the momenta are constrained to preserve their magnitude
the action principle is

$$I[\gamma, \pi, N] = \int_{\eta_i}^{\eta_f} d\eta \left[\dot\gamma_{ij}\, \pi^{ij} - N(\eta)\, h(\gamma, \pi)\right] , \tag{22}$$

where

$$h(\gamma, \pi) = \frac{1}{2}\, g_{ijkl}\, \pi^{ij} \pi^{kl} - \frac{1}{2} , \tag{23}$$

and $N(\eta)$ are Lagrange multipliers that at each instant $\eta$ enforce the constraints

$$h(\gamma, \pi) = 0 . \tag{24}$$

Equations of motion are obtained by varying with respect to $\gamma$ and $\pi$ with $\gamma$
fixed at the endpoints, $\delta\gamma_{ij}(\eta_i) = \delta\gamma_{ij}(\eta_f) = 0$. Then

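$$\dot\gamma_{mn} = N \frac{\partial h}{\partial \pi^{mn}} = N\, g_{mnij}\, \pi^{ij} , \tag{25}$$

$$\dot\pi^{mn} = -N \frac{\partial h}{\partial \gamma_{mn}} = -2N\, \gamma_{ij}\, \pi^{mi} \pi^{nj} . \tag{26}$$

There is no equation of motion for $N$. Comparing eqs. (19) and (25) we get

$$N(\eta) = L(\gamma, \dot\gamma) = \frac{d\ell}{d\eta} , \tag{27}$$

which is recognized as the "lapse" function, which gives the increase of intrinsic
time $\ell$ per unit increase of the parameter $\eta$. Then the equations of motion
simplify to

$$\frac{d\gamma_{mn}}{d\ell} = \frac{\partial h}{\partial \pi^{mn}} = g_{mnij}\, \pi^{ij} , \tag{28}$$

$$\frac{d\pi^{mn}}{d\ell} = -\frac{\partial h}{\partial \gamma_{mn}} = -2\, \gamma_{ij}\, \pi^{mi} \pi^{nj} . \tag{29}$$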

One can check that $dh/d\eta = 0$. Therefore if $h = 0$ initially, the constraint
will be consistently preserved by the evolution. One can also check that the
action $I[\gamma, \pi, N]$ is invariant under the gauge transformations

$$\delta\gamma_{mn} = \epsilon(\eta) \frac{\partial h}{\partial \pi^{mn}} , \quad \delta\pi^{mn} = -\epsilon(\eta) \frac{\partial h}{\partial \gamma_{mn}} , \quad\text{and}\quad \delta N = \dot\epsilon(\eta) , \tag{30}$$

provided $\epsilon(\eta)$ vanishes at the end points, $\epsilon(\eta_i) = \epsilon(\eta_f) = 0$. The invariance
$\delta I = 0$ holds for any path $\gamma(\eta)$, $\pi(\eta)$ and not just for those paths at which the
action is stationary. In addition, as is evident in the action $J[\gamma]$, there is an
additional invariance under global ($\eta$-independent) "conformal" transformations,
$\gamma_{ij} \to c\, \gamma_{ij}$ with $c$ a positive constant. The corresponding conserved quantity
is $\text{tr}\,\pi$. To appreciate the significance of this conserved quantity note that

$$\text{tr}\,\pi = \gamma_{mn}\, \pi^{mn} = \frac{1}{2L}\, \gamma^{mn}\, \dot\gamma_{mn} = \frac{1}{2\gamma} \frac{d\gamma}{d\ell} , \tag{31}$$

so that the determinant $\gamma$ expands or contracts at a constant relative rate. In
particular, if the initial velocity happens to be such that $\text{tr}\,\pi = 0$, then $\gamma$ remains
fixed at its constant initial value.
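The conservation statements above can be checked by integrating the equations of motion numerically. The sketch below assumes the matrix form of the evolution in intrinsic time, $d\gamma/d\ell = 2\gamma\pi\gamma$ and $d\pi/d\ell = -2\pi\gamma\pi$, with constraint $h = \text{tr}(\gamma\pi\gamma\pi) - \frac{1}{2}$ and $\text{tr}\,\pi = \text{tr}(\gamma\pi)$. The integrator is a plain fourth-order Runge-Kutta scheme, and the initial data and function names are arbitrary choices made for the example.

```python
import numpy as np

def rhs(gamma, pi):
    """Right-hand sides of the equations of motion in intrinsic time."""
    return 2.0 * gamma @ pi @ gamma, -2.0 * pi @ gamma @ pi

def rk4_step(gamma, pi, dl):
    """One fourth-order Runge-Kutta step of size dl."""
    k1g, k1p = rhs(gamma, pi)
    k2g, k2p = rhs(gamma + 0.5 * dl * k1g, pi + 0.5 * dl * k1p)
    k3g, k3p = rhs(gamma + 0.5 * dl * k2g, pi + 0.5 * dl * k2p)
    k4g, k4p = rhs(gamma + dl * k3g, pi + dl * k3p)
    gamma_new = gamma + dl * (k1g + 2 * k2g + 2 * k3g + k4g) / 6.0
    pi_new = pi + dl * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return gamma_new, pi_new

def constraint(gamma, pi):
    """h = (1/2) g_{ijkl} pi^{ij} pi^{kl} - 1/2 = tr(gamma pi gamma pi) - 1/2."""
    return np.trace(gamma @ pi @ gamma @ pi) - 0.5

def initial_momentum(gamma, velocity):
    """pi^{ij} = gamma^{ik} gamma^{jl} v_kl / (2 L), which satisfies h = 0."""
    inv = np.linalg.inv(gamma)
    lagrangian = np.sqrt(0.5 * np.trace(inv @ velocity @ inv @ velocity))
    return inv @ velocity @ inv / (2.0 * lagrangian)

# Arbitrary initial state and initial velocity (both symmetric).
gamma0 = np.array([[1.0, 0.2, 0.0], [0.2, 2.0, 0.1], [0.0, 0.1, 1.5]])
velocity = np.array([[0.3, 0.1, 0.0], [0.1, -0.2, 0.4], [0.0, 0.4, 0.5]])
pi0 = initial_momentum(gamma0, velocity)

total_length, n_steps = 1.0, 1000
gamma1, pi1 = gamma0, pi0
for _ in range(n_steps):
    gamma1, pi1 = rk4_step(gamma1, pi1, total_length / n_steps)

h0, h1 = constraint(gamma0, pi0), constraint(gamma1, pi1)
trace0, trace1 = np.trace(gamma0 @ pi0), np.trace(gamma1 @ pi1)
logdet_growth = np.linalg.slogdet(gamma1)[1] - np.linalg.slogdet(gamma0)[1]
print(h0, h1, trace0, trace1, logdet_growth, 2.0 * trace0 * total_length)
```

In this sketch the constraint and $\text{tr}\,\pi$ stay fixed to integration accuracy, and $\log\det\gamma$ grows linearly at the rate $2\,\text{tr}\,\pi$, which is the constant relative rate of expansion or contraction.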
4 Geometrodynamics: the Ultralocal Case



The system we study is a single dust cloud. To the dust cloud we associate a
probability distribution $P$ given by a product of the distributions, eq. (5), of the
individual particles,

$$P[y|\gamma] = \prod_x p(y(x)|x, \gamma_{ij}(x)) = \prod_x \frac{\gamma^{1/2}(x)}{(2\pi)^{3/2}} \exp\left[-\frac{1}{2} \gamma_{ij}(x)\, (y^i - x^i)(y^j - x^j)\right] . \tag{32}$$

It was the necessity to quantify whether we can distinguish a test particle
at $x$ from its neighbor at $x + dx$ that led us to introduce the metric $\gamma_{ij}$ in the
first place. When we consider the change from an earlier state $\gamma$ to a later state
$\gamma + \Delta\gamma$ the distinguishability problem manifests itself yet again. Even if we had
managed to distinguish a test particle at x from a neighboring test particle at 
x + dx, there is no guarantee that the particle that earlier had coordinates x will 
be the same particle that will later be found at x. Particles do not just need 
to be identified, they need to be re-identified. For the invisible points of space 
this difficulty is only exacerbated because the re-identification of points depends 
on the state of motion of the test particles. If we allow for the possibility of 
particles moving past each other we conclude that the points of space cannot 
be treated as enduring things. And this is precisely where the model discussed 
in this section becomes unrealistic: we maintain such a strict correspondence 
between a test particle and the point it occupies that we end up treating the 
individual points of space as if they were real enduring objects. A more realistic 
model of space should deal with several potentially coexisting dust clouds in 
relative motion. 

Once a dust particle in the earlier state $\gamma$ is identified with the label $x$, we
will assume that this particle can be assigned the same label $x$ as it evolves into
the later state $\gamma + \Delta\gamma$. These are comoving coordinates. Then we can write
the change $\Delta\ell$ between $P[y|\gamma + \Delta\gamma]$ and $P[y|\gamma]$, eq. (32), from their relative
entropy,

$$S[\gamma + \Delta\gamma, \gamma] = -\int \left(\prod_x dy(x)\right) P[y|\gamma + \Delta\gamma] \log \frac{P[y|\gamma + \Delta\gamma]}{P[y|\gamma]} = -\frac{1}{2}\, \Delta\ell^2 . \tag{33}$$



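Since $P[y|\gamma]$ and $P[y|\gamma + \Delta\gamma]$ are products, $S[\gamma + \Delta\gamma, \gamma]$ can be written as a sum
over the individual particles,

$$S[\gamma + \Delta\gamma, \gamma] = \sum_x S[\gamma(x) + \Delta\gamma(x), \gamma(x)] = -\frac{1}{2} \sum_x \Delta\ell^2(x) , \tag{34}$$

where

$$\Delta\ell^2(x) = g^{ijkl}(x)\, \Delta\gamma_{ij}(x)\, \Delta\gamma_{kl}(x) \tag{35}$$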
with $g^{ijkl}$ given by eq. (10). Therefore, the overall change in going from $\gamma$ to
$\gamma + \Delta\gamma$ is

$$\Delta\ell^2 = \sum_x \Delta\ell^2(x) = \int dx\, \rho(x)\, \Delta\ell^2(x) , \tag{36}$$

where we have written the discrete sum as an integral - the number of dust
particles within $dx$ is $dx\, \rho(x)$.
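The additivity of the relative entropy over particles is a direct consequence of the product form of the cloud distribution, and it can be illustrated with a few independent Gaussians. In the sketch below each particle is assumed to keep its mean (comoving labels), so only the inverse covariance matrices change; the joint distribution is then a Gaussian with a block-diagonal inverse covariance. The particle count, random matrices and function names are illustrative assumptions.

```python
import numpy as np

def relative_entropy(prec_new, prec_old):
    """S[p_new, p_old] for Gaussians with a common mean and given inverse covariances."""
    n = prec_old.shape[0]
    ratio = np.linalg.solve(prec_new, prec_old)    # prec_new^-1 prec_old
    logdet_new = np.linalg.slogdet(prec_new)[1]
    logdet_old = np.linalg.slogdet(prec_old)[1]
    return -0.5 * (np.trace(ratio) - n + logdet_new - logdet_old)

def block_diagonal(blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    start = 0
    for b in blocks:
        stop = start + b.shape[0]
        out[start:stop, start:stop] = b
        start = stop
    return out

rng = np.random.default_rng(1)
old_states, new_states = [], []
for _ in range(4):                                 # four dust particles
    a = rng.normal(size=(3, 3))
    gamma = a @ a.T + 3.0 * np.eye(3)              # positive-definite gamma_ij(x)
    b = rng.normal(size=(3, 3))
    delta = 0.05 * (b + b.T)                       # small symmetric change
    old_states.append(gamma)
    new_states.append(gamma + delta)

joint = relative_entropy(block_diagonal(new_states), block_diagonal(old_states))
per_particle = sum(relative_entropy(new, old)
                   for new, old in zip(new_states, old_states))
print(joint, per_particle)
```

The two printed numbers should coincide up to rounding, which is the discrete statement that the overall squared change is the sum of the single-particle contributions.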

Having given a sufficient specification of what we mean by a state of the 
system we can now proceed to formulate its ED. Once again we ask, 'Given 
initial and final states, what trajectory is the system expected to follow?' and 
the answer follows from the implicit assumption that there exists a continuous 
trajectory, but here we must pay closer attention to what precisely we mean 
by 'trajectory'. Indeed, if predicting changes is just a matter of careful con- 
sistent manipulation of the available information, then we must recognize that 
we know more than just that the product state, eq. (32), must evolve through a
continuous sequence of intermediate states. We also know that each and every
one of the individual factors $p(y|x, \gamma)$ must also evolve continuously through
a sequence of intermediate states to reach the corresponding final state. This
means that instead of one parameter $\omega$ there are many such parameters, one
for each position $x$, and there is no reason why they should all take the same
value. In other words, the intermediate states $\gamma_\omega$ should be labeled by a local
function $\omega(x)$ rather than a single global parameter $\omega$. A continuous sequence
of states $\gamma_\omega$ interpolating between the initial $\gamma$ and the final $\gamma + \Delta\gamma$ can be
defined by imposing $\omega(x) = \zeta f(x)$ where $f(x)$ is a fixed positive function and
the parameter $\zeta$ varies from 0 to $\infty$. There is no single trajectory; each choice
of the function f(x) defines one possible trajectory. In a sense, the cloud follows 
many alternative paths "simultaneously" . To guarantee consistency we should 
check that physical predictions are independent of the choice of the arbitrary 
function f(x). 

Before we formulate the ED we should remark on the significance of invariance
under choices of $f(x)$. The product state $P[y|\gamma]$ provides the only definition
of what an instant is, of which states $p(y|x', \gamma')$ at distant points $x'$ we can agree
to call simultaneous with a certain state $p(y|x, \gamma)$ at the point $x$. Therefore, if
there is no unique sequence of intermediate states, then there is no unique, absolute
definition of simultaneity. We see here a kind of foliation invariance, a
rudimentary, and yet extreme form of local Lorentz invariance. Since the metric
$\gamma_{ij}$ of the intermediate states $P[y|\gamma_\omega]$ remains positive for arbitrary choices of
the function $\omega(x)$ the analogues of the light cones are collapsed into light lines.
The invariant speed - the speed of light - is zero. The GD model described
here resembles the so-called ultralocal or strong gravity theories |Hni _ IS2 more
closely than it resembles general relativity.

Now we address the question: Given initial and final states, $\gamma$ and $\gamma + \Delta\gamma$,
what are the possible trajectories? Let $\eta$ be an arbitrary time parameter labeling
successive intermediate states. The initial state is $\gamma_{ij}(\eta, x) = \gamma_{ij}(x)$, the final
state is $\gamma_{ij}(\eta + \Delta\eta, x) = \gamma_{ij}(x) + \Delta\gamma_{ij}(x)$, and the intermediate states are
$\gamma_{ij}(\eta + d\eta, x) = \gamma_{ij}(x) + d\gamma_{ij}(x)$. To determine the intermediate state $\gamma + d\gamma$
one varies over $d\gamma_{ij}$ to maximize the entropy

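$$S[\gamma + d\gamma, \gamma] = -\int \left(\prod_x dy(x)\right) P[y|\gamma + d\gamma] \log \frac{P[y|\gamma + d\gamma]}{P[y|\gamma]} = -\frac{1}{2}\, d\ell^2 , \tag{37}$$

where

$$d\ell^2 = \int dx\, \rho(x)\, d\ell^2(x) \quad\text{with}\quad d\ell^2(x) = g^{ijkl}(x)\, d\gamma_{ij}(x)\, d\gamma_{kl}(x) , \tag{38}$$

subject to independent constraints at each point $x$,

$$d\ell_f(x) = \omega(x)\, d\ell(x) , \tag{39}$$

where

$$d\ell_f^2(x) = g^{ijkl}(x) \left(\Delta\gamma_{ij}(x) - d\gamma_{ij}(x)\right) \left(\Delta\gamma_{kl}(x) - d\gamma_{kl}(x)\right) . \tag{40}$$

Introducing Lagrange multipliers $\lambda(x)/2$,

$$0 = \delta \int dx\, \rho(x) \left[ -\frac{1}{2}\, d\ell^2(x) + \frac{\lambda(x)}{2} \left(\omega^2(x)\, d\ell^2(x) - d\ell_f^2(x)\right) \right] , \tag{41}$$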



the result, $d\gamma_{ij}(x) = \chi(x)\, \Delta\gamma_{ij}(x)$, coincides with the single point result, eq. (15),
for each value of $x$. Substituting $d\gamma_{ij}$ into $d\ell(x)$ and $d\ell_f(x)$ we get
$d\ell(x) = \chi(x)\, \Delta\ell(x)$ and $d\ell_f(x) = [1 - \chi(x)]\, \Delta\ell(x)$, so that

$$d\ell(x) + d\ell_f(x) = \Delta\ell(x) . \tag{42}$$

The conclusion is that the states of the individual particles evolve independently
of each other along geodesics in the single point configuration space given by
eqs. (28)-(29). The dynamics of the cloud is independent of the choice of $\omega(x)$ as
desired - this is foliation invariance.

The ultralocal statistical GD deduced in the previous paragraphs is the dynamics
of a large or perhaps infinite number of independent subsystems. The
action for the whole cloud can be written as the sum of the individual particle
actions given in eq. (17). Thus, the proposed action is

$$J[\rho, \gamma] = \int_{\eta_i}^{\eta_f} d\eta \int dx\, \rho \left(g^{ijkl}\, \dot\gamma_{ij}\, \dot\gamma_{kl}\right)^{1/2} , \tag{43}$$

where $\dot\gamma_{ij} = d\gamma_{ij}/d\eta$. In comoving coordinates $\dot\rho = d\rho/d\eta = 0$. It is straightforward
to develop the constrained Hamiltonian formalism and recover the single
particle equations of motion.

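Notice that the actual distance from the initial state to the final state along
a certain path is given by eq. (36),

$$\ell = \int_{\eta_i}^{\eta_f} d\eta \left(\dot\ell^2\right)^{1/2} = \int_{\eta_i}^{\eta_f} d\eta \left[\int dx\, \rho\, g^{ijkl}\, \dot\gamma_{ij}\, \dot\gamma_{kl}\right]^{1/2} . \tag{44}$$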



Therefore, unlike the action for a single point, eq. (17), the action (43) is not
the natural arc length. The dust cloud does not evolve along a geodesic. The
reason for this can be traced to the additional constraint that individual particles
evolve continuously, which allows a multitude of different trajectories and leads
to foliation invariance.
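The difference between the cloud action and the arc length can be seen in a toy discretization. The sketch below is an illustration under stated assumptions, not a calculation from the text: the cloud has two particles whose local geodesic lengths are 1 and 2, the integral over $x$ is replaced by a sum over particles, and two hypothetical schedules are compared, one in which both particles advance together and one in which they advance in turn.

```python
import numpy as np

def action_and_length(speeds, d_eta):
    """speeds[k, x] is the local rate of change of particle x at the k-th value of eta."""
    speeds = np.asarray(speeds, dtype=float)
    # Cloud action: sum over particles of the individual arc lengths.
    action = speeds.sum() * d_eta
    # Arc length of the product state: square root of the summed squares.
    length = np.sqrt((speeds**2).sum(axis=1)).sum() * d_eta
    return action, length

n = 1000
d_eta = 1.0 / n

# Schedule A: both particles move at constant rates over the whole interval.
together = np.tile([1.0, 2.0], (n, 1))

# Schedule B: particle 0 moves during the first half, particle 1 during the second.
in_turn = np.zeros((n, 2))
in_turn[: n // 2, 0] = 2.0
in_turn[n // 2 :, 1] = 4.0

action_a, length_a = action_and_length(together, d_eta)
action_b, length_b = action_and_length(in_turn, d_eta)
print(action_a, action_b)   # the action is the same for both schedules
print(length_a, length_b)   # the arc length depends on the schedule
```

In this toy model the action takes the same value for both schedules, which mirrors the freedom to choose a different $\omega(x)$ at each point, while the arc length of the product state changes from one schedule to the other.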
5 Conclusions



One idea explored in this work is whether it is possible to establish a connection 
between ordinary spatial distances and the information metric of Fisher and 
Rao - whether one can explain the notion of spatial distance. We succeeded 
in describing the information geometry that derives from considerations of
distinguishability among particles; particles that are easily confused are said to
be near, those that are easily distinguished are farther apart. The idea is that 
distances between particles are not distances between structureless points but 
distances between probability distributions. 

According to Euclid, a point is that which has no size. General relativity 
was founded upon a revision of Euclid's fifth postulate. Statistical geometro- 
dynamics is founded upon the further revision of Euclid's first definition, the 
notion of structureless points. 

The second idea we explored is whether Einsteinian macroscopic geometrodynamics
is derivable from an underlying microscopic statistical theory purely
on the basis of principles of inference, without additional postulates of a more 
"physical" nature. We can only claim a partial success; the result is close enough 
to be promising. The model GD we obtained satisfies the main requirement, it 
describes the dynamics of a geometry; it is related to gravity because it describes 
an ultralocal gravity theory; and it exhibits foliation invariance. Moreover, the 
somewhat puzzling fact that space and time are so different and yet enter the 
formalism in such a symmetric way receives a natural explanation: a time inter- 
val refers to the extent we can distinguish an earlier state from a later state of 
the same system, while a spatial distance refers to the extent we can distinguish 
two different systems. 

Einstein's GD might be recovered by making a different choice of the states 
and variables that describe the gravitational degrees of freedom. Two possible 
alternative choices were suggested. First, one should avoid a too strict corre- 
spondence between a test particle and the point it occupies because this treats 
the individual points of space as if they were real objects. Second, it may be 
that the Fisher-Rao metric does not describe the full geometry of space, as we 
assumed in this work, but only describes its conformal geometry. 

Should the ideas proposed here prove successful one can further expect that 
the currently popular approaches to a quantum theory of gravity will require 
revision. 